ML CA1 Ecommerce


E-Commerce Product Recommendations: Design a recommendation system platform to suggest products to users based on browsing and purchase history
Abstract: A product recommendation system is a machine learning application that suggests products that users may purchase or engage with. The system uses machine learning algorithms and data about different users and products to build a complex, connected network between these products and people. Featured offers, which the company predicts will be the most relevant items for a given customer based on data from previous customer clicks, show up on the landing page, product pages, cart, checkout and even order confirmation pages. E-commerce businesses often add lists of top-selling, new, and recommended products to their pages. In addition, the development of recommender systems using machine learning algorithms frequently encounters problems and raises questions that need to be resolved. In recommender systems, when the customer is on the product page, the new and modified recommendation list is displayed as "related products" or "other customers also bought this" if they are interested in the product. Similar functionality can be extended to some extent to offline businesses. Here, advanced machine learning techniques were applied to the product dataset with excellent results. The proposed system is designed in three parts: product popularity-based prediction; prediction based on the customer's purchase history and the ratings given by other customers who purchased similar items; and prediction for an e-commerce site that is being set up for the first time with no product reviews. When a new customer with no purchase history visits the e-shop's site for the first time, they are recommended the most popular products sold on the company's site. When the customer makes a purchase, the recommendation system updates and suggests other products based on the purchase history and the ratings given by other customers on the site. The latter part is implemented using collaborative filtering techniques.

Keywords: Machine Learning, Random Forest, Recommendation System, Decision Tree, PCA, E-commerce.
I. INTRODUCTION
Today, with the development of digital technologies and the ease of access to the internet, we are more and more exposed to a wealth of information [1]. This development brings a lot of diversity to users, but this multitude of information sources and information overload can become problematic [2]. In order to remedy this problem, various tools have been developed to filter information before transmitting it to users. Recommendation systems are tools that allow such filtering [3]. The main goal of these systems is to facilitate decision making for users by offering them information according to their preferences [4], [5].

Generally, there are two main categories of recommendation systems. One is based on content-based filtering and the other on collaborative filtering [6]. The content-based approach is implemented by taking into account the characteristics of the recommended products and creating groups of products using a similarity measure on their content [7]. Content implies a relationship between a word and a product [8]. One of the main drawbacks of this approach is that the features are usually acquired using external information that is not always available. Collaborative filtering, on the other hand, typically uses a neighbourhood of similar users and recommends products based on the history of other users within the same neighbourhood [9]. When systems issue their recommendations, they can do so globally or locally [10]. "Globally" means that all users will receive the same recommendations, while "locally" means that the recommended items will not be the same for all users.

Two approaches exist in the literature when developing a recommendation system using collaborative filtering [11]. The first approach is memory-based. That is, the system issues its recommendations based on a neighbourhood contained in memory. The second approach is model-based [12]. Here, it is necessary to first create models that resemble the users' behaviours and then use these models to issue recommendations. In practice, it has been shown that the memory-based approach offers better performance in terms of accuracy, while the model-based approach is more efficient at large scale with large data sets.

There are mainly two types of information that are used by recommender systems, implicit and explicit information [13]. Explicit information is information provided by the user. It is possible to provide this information through the use of ratings, wish lists, and comments. Unlike explicit information, implicit information is information that the user has not been asked for. For example, when a user buys a product, it creates a purchase history for that user. Recommendation systems based on implicit information can use this information to create different user profiles.
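The memory-based collaborative filtering approach described above can be illustrated with a small sketch. It is not the system built in this work: the purchase counts, user names, and the `recommend` helper are hypothetical, and cosine similarity is assumed as the neighbourhood measure. It uses implicit information (purchase history) to find the nearest users and scores products the target user has not yet bought.

```python
from math import sqrt

# Hypothetical implicit-feedback data: user -> {product: purchase count}.
purchases = {
    "alice": {"laptop": 1, "mouse": 2, "keyboard": 1},
    "bob": {"laptop": 1, "mouse": 1, "monitor": 1},
    "carol": {"novel": 3, "bookmark": 1},
}


def cosine_similarity(a, b):
    """Cosine similarity between two sparse product-count vectors."""
    shared = set(a) & set(b)
    dot = sum(a[p] * b[p] for p in shared)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(user, data, k=2, top_n=3):
    """Rank unseen products by the similarity-weighted history of the k nearest users."""
    neighbours = sorted(
        ((cosine_similarity(data[user], data[other]), other)
         for other in data if other != user),
        reverse=True,
    )[:k]
    scores = {}
    for sim, other in neighbours:
        if sim <= 0:
            continue  # users with no overlap contribute nothing
        for product, count in data[other].items():
            if product not in data[user]:
                scores[product] = scores.get(product, 0.0) + sim * count
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


print(recommend("alice", purchases))  # ['monitor']
```

Because the neighbourhood is recomputed from the stored histories on every call, this is a "local" recommender in the sense used above: different users receive different lists. A user with no overlapping history receives an empty list, which is where a "global" most-popular fallback would typically be used.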
II. LITERATURE REVIEW
One study proposed how to choose an ML algorithm for product recommendation by collecting public applications and analysing them. It found that decision tree and Bayesian algorithms are usually applied in recommendation systems. Another research paper proposes how to promote products on social media to increase business performance using the K-means machine learning algorithm for product recommendation, but this system is dependent on the social media platform. A further study examines how to help internet retailers increase their revenue using the K-means machine learning algorithm for product recommendation, but there is no standard rule for choosing the correct KPI. Another work proposes how to transform the number of blog articles and SSL certificates into search engine traffic for product recommendation, increasing business performance through Fuzzy-set Qualitative Comparative Analysis.

In the "predicted performance" study, the system's analysis aims to enhance performance using fuzzy association rules and to better anticipate sales. To estimate sales by group type, data were taken from an online store. The data are classified using modified clustering techniques with a fuzzy association rule mining approach for retail, based on variables and the implementation of the associated equations. When a new item overlaps several clusters, it is grouped as a single object and placed in one cluster using the fuzzy approach. The matrix determines the degree of proximity.

An Apriori-Based Approach to Product Placement in Order Picking, by Yusuke Ito and Shohei Kato (2016), has been used in a number of information systems studies. That study states that the method is very successful and effective, but the research was conducted only on small-scale warehouses, not on a large scale. The goal of picking is to make warehouse goods as easy as possible to manage, with the intention of shortening the time needed to collect goods in the warehouse.

III. METHODOLOGY
The goal of this research on an e-commerce product recommendation system is to provide customers with suggestions using machine learning techniques. We created a dataset from our e-commerce site using customer behaviour; this dataset contains user interactions with particular products. We then pre-processed and filtered this dataset, and applied four major machine learning algorithms (GNB, DT, RF, and LR), among others, to it to build a recommendation system model.

[Figure: overview of the architecture of the recommender system.]

Dataset Description: This dataset includes the results gathered from a customer survey for an e-commerce platform that was conducted using a Google Form. The purpose of the survey was to learn more about consumers' preferences, experiences, and satisfaction with the platform's goods and services. The dataset offers useful insights for enhancing the online shopping experience and making data-driven business choices. It was built on customer sentiment and contains transaction data. The dataset has 11 features: customer id, name, email, product model, product quantity, product price, customer address, phone number, order date, order status and customer feedback message. We converted the textual data to numeric values using the label encoding approach to prepare it for the machine learning algorithms. In total, 75% of the data are used for training the model and the remaining 25% are used for testing it.

A. Logistic Regression: Logistic Regression is a machine learning approach used for classification; it assesses whether or not an input belongs to a given class. Other names for the algorithm include logit regression, log-linear classifier, and maximum-entropy classifier (MaxEnt).

B. Decision Tree: The decision tree is a well-known machine learning algorithm used for data classification. It is a tree-structured algorithm in which internal nodes represent the features of a dataset, branches represent the decision rules, and each leaf node expresses the result. The procedures listed below are used to generate a decision tree.

C. Random Forest: In machine learning, RF is a supervised learning algorithm. Different decision trees are trained in this model using the dataset. Because many different decision trees are involved in this model's decision-making process, it is known as an ensemble of decision trees.

D. Principal Component Analysis: Principal Component Analysis (PCA) is an unsupervised machine learning algorithm. It is employed to reduce a dataset's dimensionality. It is a statistical method that uses an orthogonal transformation to turn observations of correlated features into a set of linearly uncorrelated features. The newly transformed features are called the principal components.

E. Performance parameters: We measured the performance of each model by calculating Accuracy, Mean Absolute Error (MAE), Mean Squared Error (MSE), Root Mean Squared Error (RMSE), and R-Squared.
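The label encoding, 75/25 split, and performance parameters described in this section can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the code used in this work: the helper names (`label_encode`, `split_75_25`, and the metric functions), the sample order-status values, and the fixed shuffle seed are all hypothetical.

```python
import random
from math import sqrt


def label_encode(values):
    """Map each distinct label to an integer code, in sorted label order."""
    codes = {label: i for i, label in enumerate(sorted(set(values)))}
    return [codes[v] for v in values], codes


def split_75_25(rows, seed=0):
    """Shuffle a copy of the rows and return (train, test) with 75% in train."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * 0.75)
    return shuffled[:cut], shuffled[cut:]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true, y_pred):
    """Mean of the absolute errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    """Mean of the squared errors."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Square root of the mean squared error."""
    return sqrt(mse(y_true, y_pred))


def r_squared(y_true, y_pred):
    """1 - SS_res / SS_tot; negative when worse than predicting the mean."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Hypothetical order-status column and a set of made-up predictions.
status = ["shipped", "pending", "shipped", "cancelled"]
y_true, codes = label_encode(status)   # [2, 1, 2, 0]
y_pred = [2, 1, 0, 0]

print(accuracy(y_true, y_pred))             # 0.75
print(mae(y_true, y_pred))                  # 0.5
print(mse(y_true, y_pred))                  # 1.0
print(round(r_squared(y_true, y_pred), 3))  # -0.455

train, test = split_75_25(list(range(8)))
print(len(train), len(test))                # 6 2
```

The made-up predictions deliberately give a negative R-Squared, which shows that the score can fall below zero when a model's squared error exceeds that of always predicting the mean.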
F. Accuracy: Accuracy is one parameter for assessing classification models. It is the fraction of predictions that the model got right. The formal definition of accuracy is:

Accuracy = (Number of correct predictions) / (Total number of predictions)

G. R Square/Adjusted R Square: R Square measures how much of the variation in the dependent variable the model can account for. Its name, R Square, refers to the fact that it is the square of the correlation coefficient (R). R Square is obtained by dividing the sum of squared prediction errors by the total sum of squares (computed by replacing each prediction with the mean of the actual values) and subtracting the result from 1. A higher R Square value denotes a better fit between the prediction and the actual value. The value typically ranges from 0 to 1, although it can be negative when a model fits worse than always predicting the mean. R Square is used to evaluate how well the model fits the dependent variables.

H. Mean Square Error (MSE): Mean Square Error is an absolute measurement of the goodness of fit, whereas R Square is a relative indicator of how well the model fits the dependent variables. MSE is computed by summing the squared differences between the actual output and the predicted output, and dividing the result by the total number of data points. It gives an absolute number indicating how much the predicted results deviate from the actual values.

I. Mean Absolute Error (MAE): Mean Absolute Error (MAE) is closely related to Mean Square Error (MSE). However, MAE takes the sum of the absolute values of the errors rather than the sum of their squares, as in MSE.

IV. RESULTS AND DISCUSSION
All the experimental results obtained in this work are presented in this section in tabular form and are analyzed using the performance evaluation parameters. Four machine learning algorithms (RF, DT, GNB and LR) were used for the evaluation. The results are presented in the following figures. In this measurement, the Random Forest (RF) gives the highest accuracy at 99.8%, the Decision Tree (DT) achieves 96.3%, the Gaussian Naive Bayes (GNB) gives 44.4%, and the Logistic Regression (LR) shows the lowest accuracy at 22.024%.

[Figure: ML model's accuracy performance.]

[Figure: ML model's R square performance.]

In this measurement, the Random Forest (RF) achieves the highest value at 0.99, followed by the Decision Tree (DT) at 0.96. The Gaussian Naive Bayes (GNB) records a value of -0.06, and the Logistic Regression (LR) shows -1.82.

[Figure: ML model's MAE value.]

[Figure: ML model's MSE value.]
In this measurement, the Random Forest (RF) shows the best value at 0.3, followed by the Decision Tree (DT) at 1.77. The Gaussian Naive Bayes (GNB) gives a value of 67.8, while the Logistic Regression (LR) shows 180.81.

V. CONCLUSION
Nowadays, recommender systems (RS) are widely utilized in social networks, e-commerce, and various other fields. The integration of machine learning (ML) methods, which enable computers to learn from user input and provide more personalized suggestions, marks a significant advancement in the development of RS. This research explores four machine learning approaches for recommendation engines: Decision Tree (DT), Gaussian Naive Bayes (GNB), Random Forest (RF), and Logistic Regression (LR). To optimize outcomes and save time, the study employs the Principal Component Analysis (PCA) method for feature reduction. The performance of each model is evaluated using multiple assessment metrics, including accuracy, R-square score, Mean Squared Error (MSE), and Mean Absolute Error (MAE). Experimental results indicate that the RF algorithm outperforms the GNB, DT, and LR algorithms, achieving the highest accuracy.

Data Availability
Data generated for this study is available from the corresponding author on formal request.

REFERENCES
1. A Micu, A.E. Geru, Captina, A. and Muntean (2021). The Impact of Artificial Intelligence Use on the E-Commerce in Romania. Amfiteatru Economic, 23(56), pp. 137-154.
2. Qin Xu and Jun Wang (23 January 2022). A Social-aware and Mobile Computing-based E-Commerce Product Recommendation System. Hindawi Computational Intelligence and Neuroscience, Volume 2022, Article ID 9501246.
3. Marius GERU, Angela Eliza MICU, Alexandru CAPATINA, Adrian MICU (December 2018). Using Artificial Intelligence on Social Media's User Generated Content for Disruptive Marketing Strategies in eCommerce.
4. Haris Ahmed, Dr. Tahseen Ahmed Jilani, Waleej Haider, Mohammad Asad Abbasi, Shardha Nand, Saher Kamran (January 2017). Establishing Standard Rules for Choosing Best KPIs for an E-Commerce Business based on Google Analytics and Machine Learning Technique.
5. Ezhilarasan C and Ramani S (2017). Performance Prediction using Modified Clustering Techniques with Fuzzy Association Rule Mining Approach for Retail. 2017 International Conference on Intelligent Computing and Control.
6. Ito Y and Kato S (2016). An Apriori-Based Approach to Product Placement in Order Picking. International Conference on Agents.
7. C S Fatoni, E Utami and F W Wibowo (December 2018). Online Store Product Recommendation System Uses Apriori Method. Journal of Physics: Conference Series.
8. Ali, N.M.; Alshahrani, A.; Alghamdi, A.M.; Novikov, B. SmartTips: Online Products Recommendations System Based on Analyzing Customers Reviews. Appl. Sci. 2022, 12, 8823. https://doi.org/10.3390/app12178823
9. Bhatia, L., & Prasad, S. S. (2015, February). Building a Distributed Generic Recommender Using Scalable Data Mining Library. In Computational Intelligence & Communication Technology (CICT), 2015 IEEE International Conference on (pp. 98-102). IEEE.
10. Borg, M. (2014, September). Embrace your issues: compassing the software engineering landscape using bug reports. In Proceedings of the 29th ACM/IEEE international conference on Automated software engineering (pp. 891-894). ACM.
11. Bouneffouf, D., Bouzeghoub, A., & Gançarski, A. L. (2013, January). Risk-aware recommender systems. In Neural Information Processing (pp. 57-65). Springer Berlin Heidelberg.
12. Bouneffouf, D., Bouzeghoub, A., & Gançarski, A. L. (2012). Hybrid-ε-greedy for mobile context-aware recommender system. In Advances in Knowledge Discovery and Data Mining (pp. 468-479). Springer Berlin Heidelberg.
13. Burhams, D., & Kandefer, M. (2004). Dustbot: Bringing Vacuum-Cleaner Agent to Life. Accessible Hands-on Artificial Intelligence and Robotics Education, 22-24.
14. Chen, H., Tang, Y., Li, L., Yuan, Y., Li, X., & Tang, Y. (2013). Error analysis of stochastic gradient descent ranking. Cybernetics, IEEE Transactions on, 43(3), 898-909.
15. Chen, L. S., Hsu, F. H., Chen, M. C., & Hsu, Y. C. (2008). Developing recommender systems with the consideration of product profitability for sellers. Information Sciences, 178(4), 1032-1048.
16. Cui, Q., Bai, F. S., Gao, B., & Liu, T. Y. (2015). Global Optimization for Advertisement Selection in Sponsored Search. Journal of Computer Science and Technology, 30(2), 295-310.
17. Dean, J., & Ghemawat, S. (2008). MapReduce: simplified data processing on large clusters. Communications of the ACM, 51(1), 107-113.
18. Egghe, L., & Leydesdorff, L. (2009). The relation between Pearson's correlation coefficient r and Salton's cosine measure. Journal of the American Society for Information Science and Technology, 60(5), 1027-1036.
19. Ericson, K., & Pallickara, S. (2011, December). On the performance of distributed clustering algorithms in file and streaming processing systems. In Utility and Cloud Computing (UCC), 2011 Fourth IEEE International Conference on (pp. 33-40). IEEE.
20. Ericson, K., & Pallickara, S. (2013). On the performance of high dimensional data clustering and classification algorithms. Future Generation Computer Systems, 29(4), 1024-1034.
21. Micu, A Geru, & Micu, A-E. (2017). Developing Customer Trust in E-Commerce Using Inbound Marketing Strategies. In S. Hugues, & N.Cristache (eds.), Risk in Contemporary Economy (pp. 522-531).
