Uddipto Dutta

Bangalore Urban district, India

4K followers 500+ connections

Mashreq

Indian Statistical Institute, Kolkata

About

Data Scientist with 6 years of experience in product, retail, manufacturing and finance…

Activity

Son of Anjani, giver of great strength; O Lord, ever the helper of the saints. Given the charge by Raghunath and sent forth, you burned Lanka and brought back news of Sita. Meaning: Shri Hanuman ji, son of Mother Anjani, being supremely mighty…

Liked by Uddipto Dutta
🌟 Dream Big 🌟 Today, as I take the next step in my career and join Google as a Data Scientist, it still feels surreal. My journey from Mu Sigma…

Liked by Uddipto Dutta
Meditation is essentially about connecting with a timeless divine reality that resides and presides beyond the mundane reality, which is constantly…

Liked by Uddipto Dutta

Experience

Mashreq

Bengaluru, Karnataka, India
-

Bengaluru, Karnataka, India
-

Bengaluru, Karnataka, India
-

Bengaluru, Karnataka, India
-

Bengaluru, Karnataka, India
-

Pune Area, India
-

Bengaluru, Karnataka, India
-

Kolkata Area, India

Education

Indian Statistical Institute, Kolkata

2016 - 2018

Activities and Societies: Actively participated in cricket, football and table tennis tournaments

Master's degree in the application of probability and statistics in real life problems. Course included relevant subjects on machine learning, statistical modeling, multivariate data analysis, probability and distributions, stochastic processes, reliability and statistical quality control.
2011 - 2015

Activities and Societies: Part of the department football team

Coursework covered Electronics and Communication Engineering subjects such as signal analysis using Laplace and Fourier transforms, along with Java and data structures.
2009 - 2011

Activities and Societies: Playing cricket and football

Majored in Maths, Physics, Chemistry and Computer Science with particular interest in Calculus, Inequalities and Coordinate Geometry.
1999 - 2009

Activities and Societies: Playing cricket, football and participating in plays

Consistent performer in both science and literature based subjects, especially in Mathematics.

Licenses & Certifications

Grokking the Coding Interview: Patterns for Coding Questions

Educative, Inc.

Issued Aug 2022

Credential ID 9pgPElKDD0whxNE3Zxkw2MhZ2RPzOPRZVFN

Grokking the Machine Learning Interview

Educative, Inc.

Issued Aug 2022

Credential ID Y6GKZ1ijkL8DWrvqEFj4lMnVMljKFJ

Certificate of Appreciation

upGrad

Issued Dec 2020
Deep Learning: Advanced NLP and RNNs

Udemy

Issued Feb 2020

Credential ID UC-447d08c8-a23d-43e3-9c6f-51d51d2f419b

Introduction to Artificial Neural Network and Deep Learning

Udemy

Issued Feb 2020

Credential ID UC-faf66ffd-7539-4eaO-8206-001804d32b21

Publications

From Pixels to Words: A Scalable Journey of Text Information from Product Images to Retail Catalog

CIKM '21: Proceedings of the 30th ACM International Conference on Information & Knowledge Management October 30, 2021
Extracting text of various shapes, sizes, and orientations from images containing multiple objects is an important problem in many contexts, especially e-commerce. At the scale at which Walmart operates, the text in an image can be a richer and more accurate source of data than human input, and it can be used in applications such as Attribute Extraction, Offensive Text Classification, and Product Matching, among others. This work is motivated by business requirements such as flagging products whose images contain words that do not comply with organizational policies, and building an efficient automated system that identifies similar products by comparing the information in their product images. Existing methods do not adequately address domain-specific challenges such as high entropy, varied orientations, and small text in product images. In this work, we provide a solution that not only addresses these challenges but is also proven to work at million-image scale for various retail business units within Walmart. Extensive experimentation showed that the proposed solution saves around 30% of the computational cost in both the training and inference stages.

A mean shift adjusted relative error metric to forecast accuracy measurements

24th International Conference on Computational Statistics (COMPSTAT 2021)
Measuring forecast accuracy in an appropriate and robust manner is one of the key components of time series forecasting. Time series data often contain a run of uncharacteristically low or uncharacteristically high values, known as mean shifts, arising from external factors. Measures such as MAE, MAPE, and MASE cannot account for the presence of such mean shifts in the validation period, or for whether their impact is likely to persist beyond the validation period. We attempted to address these shortcomings by proposing a new measure that accounts for such shifts in the overall pattern of the data by giving appropriate weight to the relative absolute difference between the predicted and actual value for each observation in the validation period. The proposed measure is also (1) scale-independent, and hence comparable across series, and (2) immune to producing infinite values when the validation period contains zero values. Experimental results obtained by calculating the measure on data with such mean shifts show that the proposed measure reflects model performance much more closely than the existing measures.
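The two properties listed at the end of the abstract can be illustrated with the standard measures it names. The sketch below is not the proposed measure, whose formula is not given here; it only shows, on made-up numbers, why MAE is scale-dependent and why MAPE can become infinite when the validation period contains a zero. The function names and sample data are assumptions for illustration.

```python
import math

def mae(actual, predicted):
    """Mean absolute error, expressed in the units of the series."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mape(actual, predicted):
    """Mean absolute percentage error.

    A zero actual value with a non-zero forecast makes the term
    (and therefore the whole mean) infinite.
    """
    total = 0.0
    for a, p in zip(actual, predicted):
        if a == 0:
            total += math.inf if p != 0 else 0.0
        else:
            total += abs(a - p) / abs(a)
    return 100.0 * total / len(actual)

# Hypothetical validation data containing one zero observation.
actual = [10.0, 0.0, 12.0, 11.0]
predicted = [9.0, 1.0, 12.0, 13.0]

mae_small = mae(actual, predicted)
# Same series expressed in units 100 times smaller: MAE grows by the same factor.
mae_scaled = mae([100 * a for a in actual], [100 * p for p in predicted])
mape_value = mape(actual, predicted)

print(mae_small, mae_scaled, mape_value)
```

Because MAE changes with the units of the series, it cannot be compared across series of different magnitudes, and a single zero in the actuals is enough to make MAPE unusable.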

Contextual Transformation of Short Text for Improved Classifiability

https://vixra.org/
Text classification is the task of automatically sorting a set of documents into a predefined set of categories. This task has several applications, including separating positive and negative product reviews by customers, automated indexing of scientific articles, spam filtering, and many more. At the core of this problem is extracting features from text data that can be used for classification. One common technique is to represent text data as low-dimensional continuous vectors such that semantically unrelated data are well separated from each other. However, the variability along various dimensions of these vectors is sometimes irrelevant, as it is dominated by global factors that are not specific to the classes of interest. This irrelevant variability often causes difficulty in classification. In this paper, we propose a technique that takes the initial vectorized representation of the text data through a transformation that amplifies relevant variability and suppresses irrelevant variability, and then employs a classifier on the transformed data. The results show that the same classifier exhibits better accuracy on the transformed data than on the initial vectorized representation of the text data.

Does noise hinder or boost time-series forecasting: A theoretical and empirical analysis

24th International Conference on Computational Statistics (COMPSTAT 2021)
Accurate time-series forecasting is at the core of many important practical applications. Although extensive research has been carried out in this domain, several aspects of the problem remain to be analysed from different perspectives. In this paper, we focused on analysing the impact of noise in time-series forecasting. It is a common notion that noise in training data deteriorates the accuracy of forecasting models. However, our theoretical analysis shows that injecting noise into the original time-series data, with certain constraints on the mean and variance of the noise distribution, helps to improve the accuracy of the forecasting models. The accuracy of the model starts worsening when the values of the noise parameters lie outside these constraints. We also carried out analyses to estimate an approximate theoretical bound on various parameters of the noise distribution in order to identify the optimal operating region of the forecasting model. In our empirical study, we explored different types of neural-network-based forecasting models on time-series data obtained from different practical applications. The observed behaviour of all these forecasting models on different kinds of time-series data is in agreement with our theoretical findings.
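As a rough illustration of what injecting noise into the original time-series data can look like in practice, the sketch below adds independent Gaussian noise with a chosen mean and standard deviation to a training series. The function name, the parameter values, and the sample series are assumptions; the constraints on the noise parameters derived in the paper are not reproduced here.

```python
import random

def inject_gaussian_noise(series, mean=0.0, std=0.1, seed=None):
    """Return a copy of `series` with independent Gaussian noise added to each point.

    `mean` and `std` are the parameters of the noise distribution; the values
    used below are illustrative, not bounds taken from the paper.
    """
    rng = random.Random(seed)  # local generator so runs are reproducible
    return [x + rng.gauss(mean, std) for x in series]

# Hypothetical training series.
train = [100.0, 102.0, 101.0, 105.0, 107.0]
noisy_train = inject_gaussian_noise(train, mean=0.0, std=0.5, seed=42)

print(noisy_train)
```

A forecasting model would then be trained on `noisy_train` instead of `train`, and its validation accuracy compared across different choices of `mean` and `std`.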

Drift-adjusted and arbitrated ensemble framework for time series forecasting

International Symposium on Forecasting (ISF)
Time-series forecasting is at the core of many practical applications, such as sales forecasting for businesses. Though this problem has been studied extensively for years, it is still considered challenging because of the complex and evolving nature of time-series data. Typical methods proposed for time-series forecasting model linear or non-linear dependencies between observations. However, it is generally accepted that no single method is universally effective for all kinds of time-series data. Attempts have been made to use a dynamic, weighted combination of heterogeneous and independent forecasting models, and this has been found to be a promising direction. The approach assumes that different forecasters have different specializations and varying performance on different distributions of data, and weights are dynamically assigned to the forecasters accordingly. However, in many practical time-series datasets, the distribution of the data slowly evolves with time. We propose a re-sampling-based method that adjusts the weights assigned to the forecasters to account for such distribution drift. Exhaustive testing was performed on time-series data from several real-world applications. Experimental results show the competitiveness of this method against state-of-the-art approaches for combining forecasters.

Surge-Adjusted Forecasting in Temporal Data Containing Extreme Observations

7th International Conference on Big Data Analysis and Data Mining
Forecasting in time-series data is at the core of various business decision-making activities. One key characteristic of many practical time series of business metrics, such as orders and revenue, is the presence of irregular yet moderately frequent spikes of very high intensity, called extreme observations. Forecasting such spikes accurately is crucial for business activities such as workforce planning, financial planning, and inventory planning. Traditional time series forecasting methods such as ARIMA and BSTS are not very accurate in forecasting extreme spikes. Deep learning techniques such as variants of LSTM tend to perform only marginally better than these traditional techniques. The underlying assumption of a thin-tailed data distribution is one of the primary reasons such models falter on extreme spikes, since moderately frequent extreme spikes produce a heavy-tailed distribution. On the other hand, the literature proposing methods to forecast extreme events in time series has focused mostly on the extreme events and ignored overall forecasting accuracy. We attempted to address both problems by proposing a technique that treats a time series signal with extreme spikes as the superposition of two independent signals: (1) a stationary time series signal without extreme spikes, and (2) a shock signal consisting of near-zero values most of the time along with a few spikes of high intensity. We modelled these two signals independently to forecast values for the original time series signal. Experimental results show that the proposed technique outperforms existing techniques in forecasting both normal and extreme events.
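The superposition view described in this abstract can be sketched in a few lines. The example below splits a series into a baseline signal and a shock signal using a simple robust threshold (median plus a multiple of the median absolute deviation). That threshold rule, the function name, and the sample data are assumptions made for illustration; the abstract does not say how the paper separates or models the two signals.

```python
import statistics

def split_baseline_and_shock(series, k=5.0):
    """Split `series` into (baseline, shock) such that baseline + shock == series.

    A point counts as a spike when it exceeds median + k * MAD. At spike
    positions the baseline is set to the median and the excess goes to the
    shock signal; elsewhere the shock is zero.
    """
    med = statistics.median(series)
    mad = statistics.median(abs(x - med) for x in series)
    threshold = med + k * mad
    baseline, shock = [], []
    for x in series:
        if x > threshold:
            baseline.append(med)
            shock.append(x - med)
        else:
            baseline.append(x)
            shock.append(0.0)
    return baseline, shock

# Hypothetical daily order counts with two extreme spikes.
orders = [10, 12, 11, 95, 10, 13, 11, 12, 120, 11]
baseline, shock = split_baseline_and_shock(orders)
spike_positions = [i for i, s in enumerate(shock) if s > 0]

print(spike_positions)
print(baseline)
print(shock)
```

Each component could then be forecast with its own model and the two forecasts added back together, which mirrors the independent modelling step the abstract describes.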

Projects

Data Imputation in Time Series containing Extreme Observations

Feb 2021 - Present

Imputed data in time series containing a lot of extreme observations using a bi-directional LSTM with an overall accuracy of 80%, offering an improvement of nearly 40% on the existing techniques like TS Impute, KNN, MICE etc.
China Walmart Orders Forecasting for Workforce Management

May 2019 - Present
Forecasted orders for China Walmart to aid in efficient workforce management using a number of classical and deep learning based models like Regression ARIMA, Bayesian Structural Time Series, Neural Net etc. with an innovative additional adjustment made to account for abrupt upward fluctuations in the data. An overall accuracy of 95% was achieved across 600 different stores with roughly 100k orders being placed on a daily basis.

Text Extraction from Images for Product Segmentation

Nov 2020 - Jan 2021

Worked on a text recognition model to identify inverted texts in images with an overall accuracy of 90% to segment products without labels. The features were extracted from the images using a CNN model based mainly on VGG-16 and then the extracted features were processed using an LSTM network to find out the text.
Cost Estimation based on Resource Usage by Users

-

Created a statistical model to estimate cost on the basis of resource usage by customers for running a number of queries with varying degrees of complexity. A constrained polynomial regression model was built for the purpose and achieved an overall accuracy of 85%.
Analysis of Effect of Noise on Accuracy of Neural Networks

Jan 2020 - Mar 2020
Analysed the effect of Gaussian Noise on the accuracy of deep learning based forecasting models and quantified how addition of noise actually helps in building a better neural network model.

Creation of a Robust Measure of Forecast Accuracy Measurement

Nov 2019 - Jan 2020

Created a robust measure of forecast accuracy to quantify performance of time series models much more appropriately.
Contextual Vectorisation of Text for Improved Classifiability

Jun 2019 - Nov 2019
Calculated embeddings for text data on the basis of contexts in order to facilitate clustering of said data and help gauge customer sentiment better.

Automated Data Pre-Processing

Dec 2018 - Jun 2019
Used PySpark to create modules to automatically pre-process data and perform a number of EDA operations including statistical tests and help users gain a detailed insight of the data.

Categorical Feature Extraction

Aug 2018 - Dec 2018
Extracted latent variables from categorical data to facilitate clustering of such data in order to obtain better models.

Regional Water Level Forecasting

Mar 2018 - Jun 2018

Forecasted regional water level using time series modelling techniques like ARIMA, TBATS, neural network, theta forecasting etc. with an overall 82% accuracy as a part of a Corporate Social Responsibility project to help the concerned authorities plan accordingly.
Drivers' Health Survey

Feb 2018 - Apr 2018

Worked on extracting and visually representing critical information from a drivers' health survey to help create improved safety guidelines and suggestions for a better lifestyle for them.
Prediction of NPS Scores

Jan 2018 - Feb 2018

Predicted NPS scores using relevant ML models like Random Forests, XGBoost, MARS and SVR to gauge customer feedback better.
Cycle Index for Economic Situation Forecasting

May 2017 - Jul 2017

Created a cycle index based on the forecasted values from the VAR model, using HP Filter and Principal Component Analysis, to predict potential economic stress scenarios.
Vector Autoregressive Model for Macroeconomic Factors Forecasting

May 2017 - Jul 2017

Built a Vector Autoregressive Model to forecast interconnected economic factors like GDP, HPI and Unemployment Rate.

Languages

English

Full professional proficiency
Hindi

Full professional proficiency
Bengali

Native or bilingual proficiency

More activity by Uddipto

"One should learn the essence of the scriptures from the Guru and then practice Sadhana. If one rightly follows spiritual discipline, then one…

Liked by Uddipto Dutta
Just like we have our home and love our home then we may need to go to various places for various purposes, but as soon as the work is done, we go…

Liked by Uddipto Dutta
Words! Drona managed to release an arrow that broke Dhrishtadyumna’s bow. Without wavering, Dhrishta picked up another bow & deeply pierced…

Liked by Uddipto Dutta
Today's divine darshan of Shri Balaji Maharaj from the Siddhapeeth Shri Salasar Balaji Dham, 04-10-2024, Ashwin Shukla Paksha, Friday. May the grace of Shri Balaji Maharaj be upon you and your…

Liked by Uddipto Dutta
To hear about Kṛṣṇa from Vedic literatures, or to hear from Him directly through the Bhagavad-gītā, is itself righteous activity. And for one who…

Liked by Uddipto Dutta
Fight in divine consciousness! As Drona & Arjun waged an incomparable battle against each other, demigods, celestial sages, Gandharvas, Siddhas…

Liked by Uddipto Dutta
Hail Hanuman, ocean of wisdom and virtue. Hail the lord of the monkeys, who illuminates the three worlds. 🙏⛳🌺 #JaiHanuMan #JaiShreeRam 🙏⛳🌺

Liked by Uddipto Dutta
Divine protection! As Ghatotkaca attacked & Kaurav soldiers fled, Karna did not waver & continued sending a steady stream of arrows into the sky while…

Liked by Uddipto Dutta
Downward Spiral! After Krpā’s stern reply to Karna, he smilingly said: O Krpā, I agree with you that Sri Krsna & Arjun are ordinarily incapable of…

Liked by Uddipto Dutta
Fault lies within if we see the same outside! As Dhrishtadyumna fought with Drona, Karna, Asvattāma, Salya & Duhsasana came and surrounded him…

Liked by Uddipto Dutta
Everyone has their own #story .🌟 Everyone knows their own #pain.❤️‍🩹 Never allow anyone to judge your path because only you know how much…

Liked by Uddipto Dutta
Free from the Material world! Dhrishtadyumna confronted Asvattāma, challenging: O son of Drona, I will not kill you yet. Tomorrow I will kill your…

Liked by Uddipto Dutta
The problem with distraction is that it leads to a frustrating non-presence in the present. The present moment is an opportunity for us to both…

Liked by Uddipto Dutta
